Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduled jobs: add status message if cron not running #17819

Closed
wants to merge 3 commits into from

Conversation

agh1
Copy link
Contributor

@agh1 agh1 commented Jul 13, 2020

Overview

Following discussion on #17800 this adds a pop-up on the Scheduled Jobs page if cron does not appear to be running.

Before

You can go to the Scheduled Jobs page and enable things and wonder why they aren't running.

After

image
image

Technical Details

I also updated a few links to the Managing Scheduled Jobs documentation to use the new docs URL rather than the old wiki page name.

@civibot
Copy link

civibot bot commented Jul 13, 2020

(Standard links)

@civibot civibot bot added the master label Jul 13, 2020
@kcristiano
Copy link
Member

@agh1 Bravo!

This is perfect. message appears where it's needed.

I did an r-run and it works perfectly.

@eileenmcnaughton
Copy link
Contributor

Very form-layer so OK without a test. Based on @kcristiano r-run MOP

@colemanw
Copy link
Member

I'm actually working on a PR right now that would allow you to run specific checks instead of the current all-or-nothing approach.
Which means a lot of this code could be reduced to:

$status = \Civi\Api4\System::check()->addWhere('name', '=', 'checkLastCron')->execute();

The only difference between the output of that check and what you've done here is the wording of the messages, and I would argue your wording is better and should be what the system check says.

So if you don't mind waiting a bit for my PR, we can do this a little more cleanly.

@agh1
Copy link
Contributor Author

agh1 commented Jul 14, 2020

@colemanw that sounds useful from a lot of perspectives. Is the result from that APIv4 check something distinct from the wording and severity that comes from the check? My concern is that on the first day the site is live we'd want to retain the new kinder, gentler "notice" on the system check but have an unambiguous message on the Scheduled Jobs page that nothing's going to run because the cron job isn't working.

My other concern with relying on checks to drive other parts of the UI is that this could make it more brittle. Let's say @kcristiano has a change of heart and says, "You know, go big or go home: cron not running should be an emergency, not error." It's not going to be obvious to someone editing the system check that the Scheduled Jobs page is triggering this message based upon specific severities or messages.

To resolve this, I wonder if we could take a page from Icinga2 and return some data points from checks. Maybe the output of the check via API might be along the lines of:

[
    'name' => 'checkLastCron',
    'message' => 'Last cron run at July 13th, 2020  5:31 PM. To enable scheduling support, please set up the cron job.',
    'title' => 'Cron Not Running',
    'severity' => 'warning',
    'severity_id' => 3,
    'is_visible' => 1,
    'icon' => 'fa-clock-o',
    'id' => 3,
    'data' => [
        'lastrun' => 1594675860,
        'new' => FALSE,
    ],
]

The data item could have check-specific data. Yes, this shoves the logic back to the page as far as what to do with the data, but I think that's good; we want to do slightly different things.

Finally, just a note: the APIv4 method should probably retrieve the cached result with an option to force-check, which we'd want to use in the case of the Scheduled Jobs page since it's a cheap test and someone might want to refresh the page to confirm it's working.

@colemanw
Copy link
Member

We might be able to punt the data points for now because I think this will work fine regardless of the severity getting escalated. The wording can be the same in both contexts; the "grace period" softness comes from a reduced severity, not from alternate wording.

I would suggest:

  1. Merging SystemCheck: add ability to efficiently run only specified checks #17824
  2. Changing the wording of the system cron check to the improved wording in this PR
  3. In this PR, perform the check via api, (also adding includeDisabled = true), and display the returned message if the status is greater than INFO.

@colemanw
Copy link
Member

Finally, just a note: the APIv4 method should probably retrieve the cached result with an option to force-check, which we'd want to use in the case of the Scheduled Jobs page since it's a cheap test and someone might want to refresh the page to confirm it's working.

I'm not sure what you mean here. StatusCheck results aren't cached anywhere, ever, to my knowledge.

@seamuslee001
Copy link
Contributor

@colemanw it seems at least there was an attempt at caching status checks as per https://github.com/civicrm/civicrm-core/blob/master/CRM/Core/Invoke.php#L356

@colemanw
Copy link
Member

@colemanw it seems at least there was an attempt at caching status checks as per https://github.com/civicrm/civicrm-core/blob/master/CRM/Core/Invoke.php#L356

That just caches a single integer value: $maxSeverity

@agh1
Copy link
Contributor Author

agh1 commented Jul 14, 2020

@colemanw @seamuslee001 yeah now I remember more. The maxSeverity is cached because that's the only thing that's repeatedly retrieved (to show the status in the footer of each page). The thought was that if you're viewing the actual status page or retrieving the status via API you should get fresh results each time because you're proactively going there to look.

I recall @eileenmcnaughton saying there were some checks that have a big performance hit and maybe shouldn't run except periodically. I don't know what the resolution of that was, but I suspect there may be other intensive checks that shouldn't run every time you get the status. I can think of two ways to handle that:

  1. The API could use a cache for the messages, only fetching them again if they're stale or missing. There would need to be an API param to force a check, and there would need to be something in the UI to force that too.

  2. Certain checks could be turned into "checks of a check": much like the last cron timestamp is stashed whenever cron runs and the check checks that value, the more intensive checks could be turned into scheduled jobs that stash the result, and the check would evaluate that. (I suppose the version check is like this.) The message would need to show the last time the check was run, and the status page UI would need to show people how to execute the job to get a fresh result.

@agh1
Copy link
Contributor Author

agh1 commented Jul 15, 2020

Judging by the quantity of tabs open to civicrm-core PRs on my computer, I think we may be getting into a scope problem. Here's what I propose:

  1. Merging this PR on its own.
  2. Assuming the tests on SystemCheck: add ability to efficiently run only specified checks #17824 are unrelated, merging that.
  3. Having a third PR to possibly swap the language on the cron system check and replace code on the Scheduled Jobs page that could use an APIv4 single check instead.
  4. Considering what other spots could have an APIv4 check replace an existing ad hoc equivalent or generate a new status message in the right context.

I don't want the perfect to be the enemy of the good, and while I hope the third step will be easily doable this month, I could see someone having technical or wording concerns.

@eileenmcnaughton
Copy link
Contributor

@agh1 this needs a rebase
@colemanw any response to @agh1 's last comment?

@colemanw
Copy link
Member

I'm ok with merging this and then following up once #17824 is merged. I think that other PR is merge-ready and the bottleneck is our reviewer's time.

@eileenmcnaughton
Copy link
Contributor

@agh1 are you able to look at #17824 & figure out what more review it needs

@eileenmcnaughton
Copy link
Contributor

@agh1 I think this can be MOP based on re-reading it. It does need a rebase though

@eileenmcnaughton
Copy link
Contributor

I just tried the web ui for resolving conflicts but I don't think the commit history will come out very clean this way & jenkins hates it

@colemanw
Copy link
Member

colemanw commented Aug 4, 2020

Here's a reviewer's cut using the new API method: #18065

@colemanw colemanw closed this Aug 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants